Long-read sequencing of the human cytomegalovirus transcriptome with the Pacific Biosciences RSII platform
Authors
Abstract
Long-read RNA sequencing allows precise characterization of full-length transcripts, which makes it an indispensable tool in transcriptomics. The human cytomegalovirus (HCMV) genome was first sequenced in 1989, and although short-read sequencing studies have uncovered much of the complexity of its transcriptome, only a few of its transcripts have been fully annotated. We present a long-read RNA sequencing dataset of HCMV-infected human lung fibroblast cells, sequenced on the Pacific Biosciences RSII platform. Seven SMRT cells were sequenced using oligo(dT) primers to reverse transcribe poly(A)-selected RNA molecules, and one library was prepared using random primers for the reverse transcription of the rRNA-depleted sample. Our dataset contains 122,636 human and 33,086 viral (HCMV strain Towne) reads. The described data include raw and processed sequencing files; combined with other datasets, they can be used to validate transcriptome analysis tools, compare library preparation methods, test base-calling algorithms, or identify genetic variants.
Similar resources
Corrigendum: Long-read sequencing of the human cytomegalovirus transcriptome with the Pacific Biosciences RSII platform
This corrects the article DOI: 10.1038/sdata.2017.194.
Genetic Adaptation of Porcine Circovirus Type 1 to Cultured Porcine Kidney Cells Revealed by Single-Molecule Long-Read Sequencing Technology
Porcine circovirus type 1 (PCV1) is a nonpathogenic circovirus and a contaminant of the porcine kidney (PK-15) cell line. We present the complete and annotated genome sequence of strain Szeged of PCV1, determined with the Pacific Biosciences RSII long-read sequencing platform.
Strain Kaplan of Pseudorabies Virus Genome Sequenced by PacBio Single-Molecule Real-Time Sequencing Technology
Pseudorabies virus (PRV) is a neurotropic herpesvirus that causes Aujeszky's disease in pigs. PRV strains are widely used as transsynaptic tracers for mapping neural circuits. We present here the complete and fully annotated genome sequence of strain Kaplan of PRV, determined by Pacific Biosciences RSII long-read sequencing technology.
Clustering of Short Read Sequences for de novo Transcriptome Assembly
Given the importance of transcriptome analysis in various biological studies and considering the vast amount of whole transcriptome sequencing data, it seems necessary to develop an algorithm to assemble transcriptome data. In this study we propose an algorithm for transcriptome assembly in the absence of a reference genome. First, the contiguous sequences are generated using de Bruijn graph with d...
Metagenomic and near full-length 16S rRNA sequence data in support of the phylogenetic analysis of the rumen bacterial community in steers
Amplicon sequencing on next-generation platforms has significantly transformed how research is conducted, particularly in microbial ecology. However, primer and sequencing platform biases can confound or change the way scientists interpret these data. The Pacific Biosciences RSII instrument may also preferentially load smaller fragments, which may also be a function of PCR product exhaustio...
Journal title:
Volume 4, Issue
Pages -
Publication date: 2017